ABSTRACT


Compound  and  generalized  distributions  have  been  discussed  in 
the  framework  of  contagious  distributions.  In  particular,  it  is  pointed  out 
that  the  Negative  Binomial  may  be  regarded  as  a  compound  Poisson  (using  a 
Gamma  variable  as  the  compounder)  or  as  a  generalized  Poisson  (using  a 
Logarithmic  random  variable  as  the  generalizer) .  As  an  example  of  true 
contagion  the  Negative  Binomial  is  also  a  limit  of  the  distribution  obtained 
through  Polya's  urn  model. 

A  formal  relation  between  compound  and  generalized  distributions 
is  developed,  utilizing  a  symbolic  notation.  Some  natural  extensions  of 
the  Negative  Binomial  through  repeated  compounding  with  a  Gamma 
distribution  or  through  repeated  generalizing  with  a  Logarithmic  distribution 
are  indicated. 

Some  wide  generalizations  of  Neyman's  class  of  contagious 
distributions  are  presented,  and  examination  of  their  shape  reveals 
that some simpler families with fewer parameters, such as the Poisson
∨ Pascal, offer interesting possibilities for fitting data. An attractive
property of the Poisson ∨ Pascal is that it contains the Negative
Binomial, Neyman Type A, and Poisson as special limiting cases.


SOME  FAMILIES  OF  COMPOUND  AND  GENERALIZED  DISTRIBUTIONS 


John  Gurland 


1.  Introduction 

Compound  and  generalized  distributions  arise  in  the  study  of  so-called 
contagious  distributions.  Feller  (1943)  described  two  types  of  contagion.  One 
of  these  types,  "true  contagion",  pertains  to  situations  in  which  each  "favorable" 
event  increases  (or  decreases)  the  probability  of  succeeding  favorable  events. 

The  other  of  these  types,  "apparent  contagion",  reflects  a  sort  of  heterogeneity 
of  the  population.  Still  a  further  type  of  contagion  known  as  a  "model  of  random 
colonies"  also  proves  useful  in  the  study  of  many  biological  phenomena.  This 
type  of  contagion  is  described  by  means  of  generalized  distributions. 

The  main  purpose  of  this  paper  is  an  expository  presentation  of  some 
results  on  contagious  distributions  in  which  the  relation  between  a  certain  class 
of  compound  and  of  generalized  distributions  is  utilized.  Some  general  families 
of  contagious  distributions  are  described  and  their  shape  characteristics  indicated. 
Some  consideration  is  also  given  as  to  the  selection  of  an  appropriate  family  of 
distributions  when  one  is  attempting  to  fit  data  on  the  basis  of  an  underlying 
model  of  the  type  described  here. 


Sponsored by the Mathematics Research Center, U. S. Army, Madison, Wisconsin,
under Contract No. DA-11-022-ORD-2059.

2. Contagion

2.1 Apparent contagion

This type of contagion is the result of a mixture of distributions arising
through the distribution of a parameter in an initial distribution. A well known
example is the result of applying a Gamma distribution to the mean of a Poisson
distribution (cf. Greenwood and Yule (1920)). Specifically, let the mean of
the initial distribution (the Poisson, in this example) be $\lambda$. The probability
generating function (p.g.f.) of this Poisson distribution is

$$e^{\lambda(z-1)} . \tag{1}$$

By the p.g.f. $g(z)$ of a random variable $X$ we mean $E z^X$, where
$E$ denotes expectation. When the values which $X$ may assume (with non-zero
probability) are non-negative integers, the p.g.f. expressed as a power
series yields the probabilities as the coefficients in the series. Thus

$$g(z) = \sum_{r=0}^{\infty} z^r P\{X = r\} . \tag{2}$$


On applying a Gamma distribution with probability density

$$p(\lambda) = \frac{\alpha^{\beta} e^{-\alpha\lambda} \lambda^{\beta-1}}{\Gamma(\beta)}, \qquad \lambda > 0,\ \alpha > 0,\ \beta > 0, \tag{3}$$

to the mean $\lambda$ in the above initial Poisson distribution we obtain for the
p.g.f. of the resulting distribution

$$\left[1 - \frac{1}{\alpha}(z-1)\right]^{-\beta} . \tag{4}$$

If we write $p = 1/\alpha$, $q = 1 + p$, $\beta = k$, the p.g.f. in (4) becomes

$$(q - pz)^{-k} \tag{5}$$

which  is  a  well  known  form  for  the  p.  g.  f.  of  a  Negative  Binomial  distribution. 
This  is  an  example  of  apparent  contagion,  and  on  the  basis  of  this  model  the 
Negative  Binomial  may  be  regarded  as  a  compound  Poisson  distribution.  A 
formal  definition  of  a  compound  distribution  will  be  given  in  section  3. 
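
The identity behind (3) to (5) can be checked numerically. The following is only a sketch: it approximates the mixture integral by a midpoint rule and compares it with the coefficients of $(q - pz)^{-k}$. The function names, the integration grid, and the values of $\alpha$ and $\beta$ are illustrative assumptions, not part of the original development.

```python
import math

def negative_binomial_pmf(r, k, p):
    """Coefficient of z**r in (q - p*z)**(-k), with q = 1 + p."""
    q = 1.0 + p
    log_coef = math.lgamma(k + r) - math.lgamma(k) - math.lgamma(r + 1)
    return math.exp(log_coef + r * math.log(p / q) - k * math.log(q))

def poisson_gamma_pmf(r, alpha, beta, upper=60.0, steps=60000):
    """Midpoint-rule approximation of the Poisson probability of r,
    averaged over a Gamma density (3) for the Poisson mean."""
    h = upper / steps
    total = 0.0
    for i in range(steps):
        lam = (i + 0.5) * h
        log_poisson = -lam + r * math.log(lam) - math.lgamma(r + 1)
        log_gamma = (beta * math.log(alpha) - alpha * lam
                     + (beta - 1.0) * math.log(lam) - math.lgamma(beta))
        total += math.exp(log_poisson + log_gamma)
    return total * h

alpha, beta = 2.0, 3.0            # illustrative values only
p, k = 1.0 / alpha, beta          # the substitution leading from (4) to (5)
for r in range(6):
    print(r, poisson_gamma_pmf(r, alpha, beta), negative_binomial_pmf(r, k, p))
```

The two printed columns should agree to within the accuracy of the quadrature.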

2.2 True contagion

The following urn scheme due to Polya (1930) affords an example of
true contagion and leads in a relatively simple way to the Negative Binomial
distribution. Let an urn contain $Np$ white and $Nq$ black balls, where
$p + q = 1$. Suppose $n$ successive drawings of a ball are made according to the
following rule: If a white ball is drawn it is replaced and $N\delta$ additional
white balls are put in the urn. Likewise, if a black ball is drawn it is
replaced and $N\delta$ additional black balls are put in the urn.

Polya (op. cit.) shows that when $p \to 0$, $\delta \to 0$, $n \to \infty$ such that
$np$ and $n\delta$ are held constant the distribution of the number of white balls
drawn approaches that of a Negative Binomial random variable. It is a fact of
considerable interest (cf. Arbous and Kerrich (1951); Fitzpatrick (1958)) that
the Negative Binomial may be regarded as arising through apparent contagion
or through true contagion.
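
The limiting statement can be illustrated numerically. This is a minimal sketch under two assumptions that are not spelled out in the text: the exact urn probabilities are taken in the usual product form, and the limit is taken to be a Negative Binomial with p.g.f. $(1 + d - dz)^{-h/d}$, where $h = np$ and $d = n\delta$ are the constants held fixed. All names and numerical values are hypothetical.

```python
import math

def polya_pmf(r, n, p, delta):
    """Probability of r white balls in n draws from the urn scheme above
    (proportions p and q = 1 - p, increment delta per draw)."""
    q = 1.0 - p
    log_prob = math.lgamma(n + 1) - math.lgamma(r + 1) - math.lgamma(n - r + 1)
    log_prob += sum(math.log(p + i * delta) for i in range(r))
    log_prob += sum(math.log(q + j * delta) for j in range(n - r))
    log_prob -= sum(math.log(1.0 + m * delta) for m in range(n))
    return math.exp(log_prob)

def negative_binomial_pmf(r, k, p):
    """Coefficient of z**r in (q - p*z)**(-k), with q = 1 + p."""
    q = 1.0 + p
    log_coef = math.lgamma(k + r) - math.lgamma(k) - math.lgamma(r + 1)
    return math.exp(log_coef + r * math.log(p / q) - k * math.log(q))

h, d, n = 2.0, 0.5, 20000         # illustrative: np = h and n*delta = d held fixed
for r in range(6):
    print(r, polya_pmf(r, n, h / n, d / n), negative_binomial_pmf(r, h / d, d))
```

For large $n$ the two columns should differ only in the later decimal places.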

2.3 Model of random colonies

This  model  has  wide  application  in  biological  as  well  as  other  phenomena. 
An  example  illustrating  the  mechanism  of  this  model  is  afforded  by  the  distribution 
of  insects  over  a  field.  Suppose  the  insects  are  larvae  which  hatched  from  egg- 
masses.  These  egg-masses  may  be  regarded  as  cluster  centers  or  "random 
colonies".  Actually  two  underlying  distributions  are  involved  in  the  final 
distribution  of  the  larvae.  First,  there  is  the  distribution  of  the  egg-masses 
over  the  field;  second,  there  is  the  distribution  of  larvae  which  leave  an 
egg-mass  and  arrive  at  a  particular  location  selected  at  random. 

Specifically, let the distribution of egg-masses be Poisson with p.g.f.

$$g_1(z) = e^{\lambda(z-1)} = P_0 + P_1 z + P_2 z^2 + \cdots \tag{6}$$

where $P_r = e^{-\lambda}\lambda^r / r!$ is the probability that exactly $r$ egg-masses are
represented on a randomly selected location. Suppose, further, that the
number of larvae from an egg-mass which reach the location is given by a
Logarithmic distribution with p.g.f.

$$g_2(z) = 1 - \alpha \log(q - pz), \qquad p > 0,\ \alpha > 0,\ q = 1 + p . \tag{7}$$

Now, the number of larvae at the random location may be due to
0, 1, 2, ... egg-masses. Consequently, the over-all distribution of larvae
will have p.g.f.

$$g(z) = \sum_{r=0}^{\infty} P_r \{g_2(z)\}^r = g_1\{g_2(z)\} \tag{8}$$

which, in the present instance, reduces to

$$g(z) = e^{-\alpha\lambda \log(q - pz)} = (q - pz)^{-\alpha\lambda} , \tag{9}$$


the p.g.f. of a Negative Binomial distribution. On the basis of this model
the Negative Binomial may be regarded as a generalized Poisson distribution
(cf. Quenouille (1949)). A formal relation between certain families of
compound and generalized distributions will be considered in section 3.

The Logarithmic distribution in (7) is more general than that considered by Fisher,
Corbett, and Williams (1943) or by Jones, Mollison, and Quenouille (1948) in
that it permits a positive probability for the occurrence of zero counts. When
$1 - \alpha \log q = 0$ it reduces to the more specialized Logarithmic distribution.
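
The composition $g_1\{g_2(z)\}$ of a Poisson p.g.f. with the Logarithmic p.g.f. $1 - \alpha\log(q - pz)$ can be expanded term by term and compared with a Negative Binomial of index $\alpha\lambda$. The sketch below uses the standard recurrence for the coefficients of the exponential of a power series; the function names and parameter values are assumptions made for illustration only.

```python
import math

def exp_of_series(f):
    """Coefficients of exp(f(z)) from the coefficients f[0], f[1], ... of f,
    using n*g[n] = sum_{j=1..n} j*f[j]*g[n-j]."""
    g = [math.exp(f[0])]
    for n in range(1, len(f)):
        g.append(sum(j * f[j] * g[n - j] for j in range(1, n + 1)) / n)
    return g

def poisson_logarithmic_pmf(lam, alpha, p, terms):
    """Probabilities of the generalized Poisson with p.g.f. exp{lam*[g2(z) - 1]},
    where g2(z) = 1 - alpha*log(q - p*z) and q = 1 + p."""
    q = 1.0 + p
    f = [-lam * alpha * math.log(q)]
    f += [lam * alpha * (p / q) ** j / j for j in range(1, terms)]
    return exp_of_series(f)

def negative_binomial_pmf(r, k, p):
    """Coefficient of z**r in (q - p*z)**(-k), with q = 1 + p."""
    q = 1.0 + p
    log_coef = math.lgamma(k + r) - math.lgamma(k) - math.lgamma(r + 1)
    return math.exp(log_coef + r * math.log(p / q) - k * math.log(q))

lam, alpha, p = 3.0, 0.8, 0.5     # illustrative; alpha*log(1 + p) <= 1 keeps g2(0) >= 0
for r, value in enumerate(poisson_logarithmic_pmf(lam, alpha, p, 10)):
    print(r, value, negative_binomial_pmf(r, lam * alpha, p))
```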

3.  A  formal  relation  between  some  compound  and  generalized  distributions. 

For  convenience  we  employ  the  definitions  and  notation  employed  by 
Gurland  (1957). 

Definition  1  Compound  distribution 

Let the random variable $X_1$ have the distribution function $F_1(x_1 \mid \theta)$
for a given value of the variable and of the parameter $\theta$. Suppose now
that $\theta$ is regarded as a random variable $X_2$, say, with distribution function
$F_2$. Denote by $X_1 \wedge X_2$ the random variable with distribution function $F$
given by

$$F(x_1) = \int_D F_1(x_1 \mid c x_2)\, dF_2(x_2) \tag{10}$$

for each value of $x_1$, where $D$ is the domain of $F_2$. Here $c$ is a constant
which is arbitrary. (Values of $c$ for which (10) is not a distribution function
are excluded.) The random variable $X_1 \wedge X_2$ (uniquely defined here apart from
the constant $c$) is called a compound $X_1$ variable with respect to the
"compounder" $X_2$.

In the example of 2.1, $X_1$ is a Poisson random variable with mean $\lambda$
and $X_2$ is a Gamma random variable with probability density given by (3).
The constant $c$ was taken as unity, but in this example there is no loss of
generality because (4) would have become, with $c$ in place of unity,
$\left[1 - \frac{c}{\alpha}(z-1)\right]^{-\beta}$; and we would then define $p = c/\alpha$ instead of $1/\alpha$.

Definition  2  Equivalent  distributions 

Suppose the random variables $X_1$, $X_2$ have distribution functions
$F_1(x \mid \alpha)$, $F_2(x \mid \beta)$ respectively; $\alpha$ and/or $\beta$ may be multi-dimensional.
If for each $\alpha$ there exists some $\beta$ and for each $\beta$ there exists some $\alpha$
such that $F_1(x \mid \alpha) = F_2(x \mid \beta)$ whatever be $x$, the random variables $X_1$,
$X_2$ are said to be equivalent, and we write $X_1 \sim X_2$.

It  is  often  convenient  to  represent  a  random  variable  by  the  name  of 
its  corresponding  distribution.  Thus,  in  the  case  of  the  compound  Poisson 
considered  in  section  2. 1  we  might  write 

$$\text{Poisson} \wedge \text{Gamma} \sim \text{Negative Binomial} \tag{10}$$

It may happen, as in several cases considered below, that the initial
distribution being compounded may have several parameters but only a
particular one of them is regarded as a random variable. In such cases
the notation $X_1 \wedge X_2$ as employed in (10) might become ambiguous;
for these cases the notation will be modified as required. In the example
above represented by (10) there is no ambiguity since the Poisson has only
one parameter, namely, the mean.


Definition 3  Generalized distribution

Let the random variables $X_1$, $X_2$ have p.g.f.'s $g_1(z)$, $g_2(z)$
respectively. Denote by $X_1 \vee X_2$ the random variable with p.g.f.
$g_1(g_2(z))$. Then $X_1 \vee X_2$ is called a generalized $X_1$ variable with
respect to the "generalizer" $X_2$.

Theorem

Let $X_1$ be a random variable with p.g.f. $[h(z)]^{\theta}$ where $\theta$ is a
given parameter. Suppose now $\theta$ is regarded as a random variable $X_2$,
say, with distribution function $F_2$ and p.g.f. $g_2$. Then, whatever be $F_2$,

$$X_1 \wedge X_2 \sim X_2 \vee X_1 \tag{11}$$

assuming the p.g.f. of these random variables exists.

Proof

The proof follows immediately from the definition of compound and
generalized distributions. In fact, the p.g.f. of $X_1 \wedge X_2$ is given by

$$\int_D [h(z)]^{cx}\, dF_2(x)$$

while that of $X_2 \vee X_1$ is given by

$$g_2\{g_1(z)\} = \int_D [h(z)]^{\theta x}\, dF_2(x) .$$

These are, of course, equal when $c = \theta$.

It is interesting to note the role of the constant $c$ introduced in
the definition of compound random variable.

As an example of applying the above theorem let $X_1$ and $X_2$ both
be Poisson random variables. Then

$$\text{Poisson} \wedge \text{Poisson} \sim \text{Poisson} \vee \text{Poisson} . \tag{12}$$

This  distribution  is  called  the  Neyman  Type  A  (cf.  Neyman,  1939),  and  may 
be  interpreted  both  as  a  compound  Poisson  and  as  a  generalized  Poisson,  as 
was  pointed  out  by  Feller  (  1943). 
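
The two readings of the Neyman Type A can be compared numerically. In this sketch the compound form averages a Poisson with mean $c\,m$ over a Poisson-distributed $m$, while the generalized form expands the p.g.f. $\exp\{\lambda_2[e^{c(z-1)} - 1]\}$ as a power series. The names `lam2` and `c`, the truncation point, and the numerical values are illustrative assumptions.

```python
import math

def poisson_pmf(r, mean):
    """Poisson probability of r; a mean of zero puts all mass at r = 0."""
    if mean == 0.0:
        return 1.0 if r == 0 else 0.0
    return math.exp(-mean + r * math.log(mean) - math.lgamma(r + 1))

def compound_pmf(r, lam2, c, m_max=200):
    """Compound form: Poisson with mean c*m, mixed over m ~ Poisson(lam2).
    The sum over m is truncated at m_max."""
    return sum(poisson_pmf(m, lam2) * poisson_pmf(r, c * m) for m in range(m_max))

def generalized_pmf(lam2, c, terms):
    """Generalized form: series coefficients of exp{lam2*[g1(z) - 1]},
    where g1 is the p.g.f. of a Poisson with mean c."""
    f = [lam2 * (poisson_pmf(j, c) - (1.0 if j == 0 else 0.0)) for j in range(terms)]
    g = [math.exp(f[0])]
    for n in range(1, terms):
        g.append(sum(j * f[j] * g[n - j] for j in range(1, n + 1)) / n)
    return g

lam2, c = 2.0, 1.5                # illustrative values only
for r, value in enumerate(generalized_pmf(lam2, c, 10)):
    print(r, compound_pmf(r, lam2, c), value)
```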

It should be noted both in the theorem and in the above definitions that
the random variables $X_1$, $X_2$ need not be discrete. For $X_1$ the p.g.f. is
$E z^{X_1}$ and likewise, of course, for $X_2$. The following example illustrates the
point.

$$\text{Poisson} \wedge \text{Gamma} \sim \text{Gamma} \vee \text{Poisson} . \tag{13}$$

To verify (13) we note that Poisson ∧ Gamma is equivalent to a
Negative Binomial. It suffices, therefore, to show that Gamma ∨ Poisson
is also equivalent to a Negative Binomial. Now the moment generating
function $E e^{tX}$ of the Gamma random variable $X$ with probability density
given by (3) is

$$\left(1 - \frac{t}{\alpha}\right)^{-\beta} . \tag{14}$$

Replacing $e^t$ by $z$ yields the p.g.f.

$$\left(1 - \frac{\log z}{\alpha}\right)^{-\beta} . \tag{15}$$

If the p.g.f. $e^{\lambda(z-1)}$ of a Poisson is substituted for $z$ in (15) we obtain

$$\left[1 - \frac{\lambda}{\alpha}(z-1)\right]^{-\beta}$$

which corresponds to a Negative Binomial as required.

Let us next consider examples of compounding a distribution which
involves more than one parameter. Take, for instance, a Negative Binomial
with p.g.f. $(q - pz)^{-k}$. For brevity we shall refer to this distribution as
Pascal $(k, p)$. The above theorem and relation (11) apply if the index
parameter $k$ is regarded as the random variable $X_2$. Taking $X_2$ to be
a Poisson and a Gamma random variable respectively yields the following
relations

$$\text{Pascal}(k, p) \underset{k}{\wedge} \text{Poisson} \sim \text{Poisson} \vee \text{Pascal} \tag{16}$$

$$\text{Pascal}(k, p) \underset{k}{\wedge} \text{Gamma} \sim \text{Gamma} \vee \text{Pascal} \tag{17}$$

It should be noted that the letter $k$ is inserted below the symbol "∧"
to obviate the possible ambiguity mentioned earlier.

Although the term "Pascal distribution" commonly refers to the particular
case of a Negative Binomial distribution with index parameter $k$ an integer,
we employ the same terminology for the Negative Binomial for convenience
in writing (cf. Gurland (1959), Katti and Gurland (1962)).

The examples in sections 2.1 and 2.3 exhibiting the Pascal
distribution as a compound Poisson and generalized Poisson, respectively,
can be expressed symbolically as

$$\text{Poisson} \wedge \text{Gamma} \sim \text{Poisson} \vee \text{Logarithmic} . \tag{18}$$

It was shown by Gurland (1957) that this relation can be extended.
Thus,

$$(\text{Poisson} \wedge \text{Gamma}) \wedge \text{Gamma} \sim (\text{Poisson} \vee \text{Logarithmic}) \vee \text{Logarithmic} \tag{19}$$

that is

$$\text{Pascal} \wedge \text{Gamma} \sim \text{Pascal} \vee \text{Logarithmic} . \tag{20}$$

This extension can, in fact, be carried out any number of times.
The next step, for example, would be

$$(\text{Pascal} \wedge \text{Gamma}) \wedge \text{Gamma} \sim (\text{Pascal} \vee \text{Logarithmic}) \vee \text{Logarithmic} \tag{21}$$


and so on.

4. A generalization of Neyman's class of contagious distributions

Let us consider the example in section 2.3 in more detail and in
a modified form. As before, let the probability that exactly $r$ egg-masses
are represented on a randomly selected location be given by a Poisson
distribution

$$P_r = e^{-\lambda_1} \lambda_1^r / r! . \tag{22}$$

Before, we were interested merely in the number of larvae which move
from an egg-mass to a particular location. In the present instance we are
also interested in the number of survivors in an egg-mass, that is, the
number of larvae that hatch out. Suppose the number of survivors in an
egg-mass is a Poisson random variable with mean $\lambda$, say. That is,
the probability that there are exactly $n$ survivors in an egg-mass is
given by

$$e^{-\lambda} \lambda^n / n! . \tag{23}$$

Suppose that in a particular egg-mass there are $n$ survivors. The
probability that exactly $s$ of them will be found at a particular location
will be assumed to be

$$\binom{n}{s} p^s (1-p)^{n-s} \tag{24}$$

which corresponds to a Binomial distribution with parameters $n$, $p$.

A straightforward application of the notions of compound and
generalized distributions discussed in sections 2 and 3 yields as
the p.g.f. of the distribution of larvae

$$e^{\lambda_1 [g(z) - 1]} \tag{25}$$

where $g(z)$ is the p.g.f. of the Binomial distribution in (24) compounded
with the Poisson distribution in (23). A simple argument utilizing the relation

$$\text{Binomial}(n, p) \underset{n}{\wedge} \text{Poisson} \sim \text{Poisson} \vee \text{Binomial}$$

shows that

$$g(z) = e^{\lambda p (z-1)} \tag{26}$$

which corresponds to a Poisson. Consequently, the resulting distribution
given by (25) is a Neyman Type A.
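
The "simple argument" can be spelled out as a short calculation. It is offered only as a sketch, assuming the Binomial p.g.f. $(1 - p + pz)^n$ for (24) and writing $\lambda$ for the Poisson mean in (23) and $\lambda_1$ for that in (22):

```latex
\begin{align*}
g(z) &= \sum_{n=0}^{\infty} \frac{e^{-\lambda}\lambda^{n}}{n!}\,(1 - p + pz)^{n}
      = e^{-\lambda}\, e^{\lambda(1 - p + pz)}
      = e^{\lambda p (z-1)}, \\
e^{\lambda_1[g(z) - 1]} &= \exp\{\lambda_1[e^{\lambda p (z-1)} - 1]\}.
\end{align*}
```

The second line has the form of a Poisson generalized by a Poisson with mean $\lambda p$, which is the Neyman Type A form of (12).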

As a first step in extending this family of distributions suppose the
parameter $p$ in (24) may (more realistically) be regarded as a random
variable, following, say, a Beta distribution with probability density

$$\frac{x^{\alpha-1} (1-x)^{\beta-1}}{B(\alpha, \beta)}, \qquad 0 < x < 1;\ \alpha > 0,\ \beta > 0 . \tag{27}$$

On compounding the distribution in (26) with this Beta distribution
we obtain the p.g.f. $g_1(z)$, say, where

$$g_1(z) = \frac{1}{B(\alpha, \beta)} \int_0^1 e^{\lambda (z-1) x} x^{\alpha-1} (1-x)^{\beta-1}\, dx = {}_1F_1\{\alpha, \alpha+\beta, \lambda(z-1)\} \tag{28}$$

and where ${}_1F_1$ is the well-known confluent hypergeometric function. For
convenience let us refer to the distribution in (28) as Type $H_1$. Then the
distribution of larvae is a generalized Poisson represented by

$$\text{Poisson} \vee \text{Type } H_1 \tag{29}$$

as obtained by Gurland (1958). If in (28) we set $\alpha = 1$, the family (29)
reduces to that of Beall and Rescia (1953).

As a further step in extending Neyman's family of distributions the
parameter $\lambda$ in (23) may also be regarded as a random variable. This
is a realistic consideration because different egg-masses would conceivably
be associated with different probabilities of survival. If we assume a Gamma
distribution for $\lambda$, then (23) becomes a Pascal distribution with p.g.f.,
say, $(q_1 - p_1 z)^{-k_1}$. Treating $p$ in (24) as a random variable as before,
the distribution of larvae becomes a generalized Poisson with p.g.f.

$$e^{\lambda_1 [g_2(z) - 1]}$$

where

$$g_2(z) = \frac{1}{B(\alpha, \beta)} \int_0^1 \frac{x^{\alpha-1} (1-x)^{\beta-1}}{[1 - p_1 x (z-1)]^{k_1}}\, dx = {}_2F_1(k_1, \alpha, \alpha+\beta, p_1(z-1)) . \tag{30}$$

If, for convenience, we refer to the distribution corresponding to $g_2(z)$ as
Type $H_2$, then the distribution of larvae may be represented by

$$\text{Poisson} \vee \text{Type } H_2 . \tag{31}$$

If, in addition to the above compounding, we also allow the parameter
$\lambda_1$ in (22) to follow a Gamma distribution, the distribution of egg-masses
becomes a Pascal. In analogy with (29) and (31) we obtain two more families
of distributions represented by

$$\text{Pascal} \vee \text{Type } H_1 \tag{32}$$

$$\text{Pascal} \vee \text{Type } H_2 \tag{33}$$

respectively.

As some of these general families contain many parameters and are not
particularly simple to work with, it would be interesting to examine their
characteristics in the hope of finding simpler families which might be
similar in shape. Some results along these lines are considered in
section 5.

5. Skewness and kurtosis of some families of distributions

Among the usual characteristics of interest in assessing the shape
of a distribution are the skewness and kurtosis. These are measured by
$\mu_3 / \mu_2^{3/2}$ and $\mu_4 / \mu_2^2$ respectively, where $\mu_2$, $\mu_3$, $\mu_4$ are central moments
of the orders indicated by the subscripts. To standardize the distributions
under comparison in some reasonable sense, we have reparametrized them
to have the same mean $kp$ and the same variance $kp(1 + p)$ as the Negative
Binomial. This is suggested by a similar comparison made by Anscombe (1950)
in the case of a few two-parameter families of distributions he compared with
the Negative Binomial.

As measures of skewness and kurtosis we have also employed the
same quantities as Anscombe (op. cit.), $\kappa_{(3)} / kp^3$ and $\kappa_{(4)} / kp^4$, where
$\kappa_{(3)}$ and $\kappa_{(4)}$ are the third and fourth factorial cumulants. For the
distributions we have considered these measures are particularly convenient
both from the standpoint of calculation and from the fact that the final measures
obtained do not involve the parameters $k$, $p$.

A note of caution should be made, however, in the use of the above
quantities as measures of skewness and kurtosis. Since

$$\mu_3 = \kappa_{(3)} + \text{a function involving only the first two moments}$$

$$\mu_4 = \kappa_{(4)} + 6\kappa_{(3)} + \text{a function involving only the first two moments}$$

and the first two moments of all the distributions under comparison are the
same, it follows that when $\kappa_{(3)} / kp^3$ and $\kappa_{(4)} / kp^4$ are both increasing or
both decreasing then the distributions can, in fact, be ordered according to
skewness and kurtosis. For all the two-parameter families appearing in
Table 1 this is actually the case. For those families containing more than
two parameters and involving the Type $H_1$ or Type $H_2$ distributions there
are some values of the parameters for which the above quantities involving
factorial cumulants increase or decrease in opposite directions. The interval
between minimum and maximum values, however, is of some value in the
comparison of the shapes of the various distributions in Table 1. Each pair
of numbers in the table enclosed in parentheses indicates such an interval.

As a further explanation of the distributions appearing in Table 1,
the Neyman B and Neyman C are special cases of the family (29) with
$\alpha = 1$ and $\beta = 1, 2$ respectively in (27). The Polya-Aeppli distribution is
also a special case of the above family with $\alpha = 1$ and $\beta = \infty$ (cf. Gurland
(1958)). The Polya-Aeppli distribution can also be defined formally as a special
case of the Poisson ∨ Pascal with p.g.f. $e^{\lambda[(q - pz)^{-1} - 1]}$.
TABLE 1

Measures of skewness and kurtosis for some distributions with the same
first two moments $kp$, $kp(1 + p)$

Distribution            $\kappa_{(3)} / kp^3$      $\kappa_{(4)} / kp^4$

Poisson ∨ Binomial      (0, 1)                     (0, 1)
Neyman A                1                          1
Neyman B                9/8                        27/20
Neyman C                6/5                        8/5
Polya-Aeppli            3/2                        3
Pascal                  2                          6
Pascal ∧ Gamma          (1.75, 2)                  (4.373, 6)
Poisson ∨ Pascal        (1, 2)                     (1, 6)
Pascal ∨ Poisson        (1, 2)                     (1, 6)
Pascal ∨ Pascal         (1, 2)                     (1, 6)
Poisson ∨ $H_1$         (1, 2)                     (1, 6)
Pascal ∨ $H_1$          (1, 2)                     (1, 6)
Poisson ∨ $H_2$         (1, 4)                     (1, 36)

It is apparent from Table 1 that for all the two-parameter families
under consideration the skewness and kurtosis are both increasing. From the
Neyman A on through to the Pascal there is a range (1, 2) for the skewness
measure and a range (1, 6) for the kurtosis measure. It is particularly
interesting that for the Poisson ∨ Pascal, Pascal ∨ Poisson, Pascal ∨ Pascal,
Poisson ∨ $H_1$, and Pascal ∨ $H_1$ the range between minimum and maximum
for the skewness measure is also (1, 2) and for the kurtosis measure is also
(1, 6). Note that the Poisson ∨ Pascal and Pascal ∨ Poisson are three-parameter
families whereas the Pascal ∨ Pascal and Poisson ∨ $H_1$ involve four parameters,
and the Pascal ∨ $H_1$ and Poisson ∨ $H_2$ involve five parameters.

As the Poisson ∨ Pascal and Pascal ∨ Poisson are simpler families than
those involving more parameters, their flexibility of shape is a recommendation
in favor of their use. Of these two distributions the Poisson ∨ Pascal lends
itself to simpler computation of the probabilities and estimation of the
parameters required in the fitting of the distribution to observed data.

It is also evident from Table 1 that the Poisson ∨ Binomial covers the
range of skewness (0, 1) and the range of kurtosis (0, 1). As the corresponding
ranges for the Poisson ∨ Pascal are (1, 2) and (1, 6), this shows that these
relatively simple three-parameter families, the Poisson ∨ Binomial and the
Poisson ∨ Pascal, cover a wide range of possible shapes. Methods of
estimating the parameters and computing the probabilities in these distributions
appear in a number of recent papers: Shumway and Gurland (1960), (1961),
Katti and Gurland (1961), (1962a).
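
Several single-valued entries of Table 1 can be reproduced with exact arithmetic. The sketch below relies on a standard fact that is assumed here rather than stated in the text: for a generalized Poisson, the $r$-th factorial cumulant is the Poisson mean times the $r$-th factorial moment $m_{(r)}$ of the generalizer. Matching the mean $kp$ and the variance $kp(1+p)$ gives $\kappa_{(1)} = kp$ and $\kappa_{(2)} = kp^2$, so the two measures reduce to $m_{(3)} m_{(1)} / m_{(2)}^2$ and $m_{(4)} m_{(1)}^2 / m_{(2)}^3$, in which scale parameters cancel. The function names are illustrative.

```python
from fractions import Fraction

def rising(a, r):
    """Rising factorial a(a+1)...(a+r-1) as an exact fraction."""
    out = Fraction(1)
    for i in range(r):
        out *= Fraction(a) + i
    return out

def shape_measures(m):
    """Return (kappa_(3)/kp^3, kappa_(4)/kp^4) from the factorial moments
    m[1], ..., m[4] of the generalizer (m[0] is unused)."""
    skew = m[3] * m[1] / m[2] ** 2
    kurt = m[4] * m[1] ** 2 / m[2] ** 3
    return skew, kurt

def poisson_moments():
    # Poisson generalizer with unit mean: every factorial moment equals 1.
    return [None] + [Fraction(1)] * 4

def type_h1_moments(alpha, beta):
    # Type H_1 generalizer with unit scale: (alpha)_r / (alpha + beta)_r.
    return [None] + [rising(alpha, r) / rising(alpha + beta, r) for r in range(1, 5)]

def pascal_moments(k):
    # Pascal generalizer with p = 1: (k)_r.
    return [None] + [rising(k, r) for r in range(1, 5)]

families = {
    "Neyman A": poisson_moments(),
    "Neyman B": type_h1_moments(1, 1),
    "Neyman C": type_h1_moments(1, 2),
    "Polya-Aeppli": pascal_moments(1),
}
for name, moments in families.items():
    print(name, *shape_measures(moments))
```

Under these assumptions the script prints 1 and 1 for the Neyman A, 9/8 and 27/20 for the Neyman B, 6/5 and 8/5 for the Neyman C, and 3/2 and 3 for the Polya-Aeppli.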
6. Considerations in the choice of a family of contagious distributions

From the preceding sections it is evident that many forms of compound
and generalized distributions are possible. As some of these distributions are
simpler than others, yet are meaningful biologically and do not suffer seriously
in loss of flexibility, the following three criteria might be suggested as
important in the choice of an appropriate family:

(i)  Simplicity 

(ii)  Flexibility 

(iii)  Meaningful  parameters 

The Negative Binomial is one of the most widely used discrete
distributions because it is relatively simple and is very convenient
computationally, although the estimation of the parameters is rather tedious
if the method of maximum likelihood is employed (cf. Fisher (1953), Bliss (1953)).

The Neyman Type A distribution, a two-parameter family, is also widely
used (cf. Beall (1940), Evans (1953)) but it is not as convenient in computing
probabilities as is the Negative Binomial. Methods have been devised for
simplifying these computations (cf. Douglas (1955)). The estimation of the
parameters by maximum likelihood is also tedious, but alternative methods
which are simpler and retain high efficiency have been suggested both for the
Negative Binomial and the Neyman Type A by Katti and Gurland (1962b).

If none of the relatively simple distributions such as the Poisson,
Negative Binomial, Neyman Type A is appropriate then one of the three-parameter
families suggested in section 5 might be utilized. On the basis
of only a few isolated experiments it is, of course, not possible to distinguish
effectively between competing distributions, in which case the simpler ones,
if they provide a good fit, are to be preferred. On the other hand, if many
experiments are carried out in the same classes of situations, and if there
is ample evidence that none of the simple distributions is appropriate, then
a more flexible distribution such as the Poisson ∨ Pascal, say, might be tried.

The Poisson ∨ Pascal affords an attractive alternative because it is also
relatively simple (almost as easy to work with as the Neyman Type A) and
because it subsumes the Negative Binomial, the Neyman Type A, and the
Poisson as limiting cases (cf. Katti and Gurland (1961)). Specifically, let
a Poisson ∨ Pascal have p.g.f. $g(z) = e^{\lambda[(q - pz)^{-k} - 1]}$. Table 2 gives
the limiting form of $g(z)$ for different passages to the limit.

TABLE 2

Some limiting forms of the Poisson ∨ Pascal distribution

No.   Limits taken                                                   Limiting p.g.f.                            Name of limiting distribution

1     $k \to \infty$, $p \to 0$, $pk = \lambda_1$                    $e^{\lambda[e^{\lambda_1(z-1)} - 1]}$      Neyman Type A
2     $k \to 0$, $\lambda \to \infty$, $\lambda k = k_1$             $(q - pz)^{-k_1}$                          Negative Binomial
3     $p \to 0$, $\lambda \to \infty$, $\lambda k p = \lambda_2$     $e^{\lambda_2(z-1)}$                       Poisson
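
The three passages to the limit can be sketched as follows, taking the Poisson ∨ Pascal p.g.f. in the form $g(z) = e^{\lambda[(q - pz)^{-k} - 1]}$ with $q = 1 + p$, and writing $\lambda_1$, $k_1$, $\lambda_2$ for the constants held fixed in the respective limits (these labels are chosen here for illustration):

```latex
\begin{align*}
\text{(1)}\quad & (q - pz)^{-k} = \left[1 - \tfrac{\lambda_1 (z-1)}{k}\right]^{-k}
   \to e^{\lambda_1 (z-1)},
   && \text{so } g(z) \to e^{\lambda[e^{\lambda_1(z-1)} - 1]}; \\
\text{(2)}\quad & \lambda[(q - pz)^{-k} - 1] = -\lambda k \log(q - pz) + O(\lambda k^2)
   \to -k_1 \log(q - pz),
   && \text{so } g(z) \to (q - pz)^{-k_1}; \\
\text{(3)}\quad & \lambda\{[1 - p(z-1)]^{-k} - 1\} = \lambda k p (z-1) + O(\lambda p^2)
   \to \lambda_2 (z-1),
   && \text{so } g(z) \to e^{\lambda_2 (z-1)}.
\end{align*}
```

In (2) the remainder vanishes because $\lambda k^2 = k_1 k \to 0$, and in (3) because $\lambda p^2 = \lambda_2 p / k \to 0$ for fixed $k$.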
Some methods for simplifying the computation of the probabilities
and for obtaining the maximum likelihood estimates of the parameters in a
Poisson ∨ Pascal distribution are given by Shumway and Gurland (1961).
Estimation of the parameters in this distribution by the technique of minimum
chi-square is considered by Katti and Gurland (1961). In Table 3, taken from
the latter paper, we see the results of fitting a Poisson ∨ Pascal and a Polya-Aeppli
to some data of Beall and Rescia (1953).

TABLE 3

Fit of the observed frequency of Lespedeza Capitata
from Table V of Beall-Rescia (1953)

                            Expected frequency due     Expected frequency as
Plants      Observed        to Poisson ∨ Pascal        in Beall-Rescia (1953)
            frequency       (Method of moments)

0           7178            7185.0                     7217.6
1           286             276.0                      218.6
2           93              94.5                       105.5
3           40              41.5                       50.9
4           24              20.2                       24.5
5           7               10.4                       11.8
6           5               5.6                        5.7
7           1               3.1                        2.8
8           2               1.7                        1.3
9           1               1.0                        0.6
10          2               0.6                        0.3
11+         1               0.3                        0.4

$\chi^2$                    9.58                       42.97
Degrees of freedom          8                          9
It is evident from the $\chi^2$ values at the foot of Table 3 that the
Poisson ∨ Pascal definitely provides a much closer fit. This is not
surprising because of the much greater flexibility of the Poisson ∨ Pascal.

For a lower range of skewness and kurtosis the information in
Table 1 suggests the use of the Poisson ∨ Binomial distribution. From
the form of its p.g.f. $g(z) = e^{\lambda[(1 - p + pz)^n - 1]}$ it is evident that this
distribution converges rather quickly to the Neyman Type A distribution
as $n \to \infty$, $p \to 0$ with $np$ constant. For small values of $n$, however,
it may be quite useful, and has been applied by McGuire et al. (1957)
and Sprott (1958).

TABLE 4

Fit of the observed frequency of Pyrausta Nubilalis
from Distribution 6 of McGuire et al. (1957)

Corn        Observed        Expected frequency due to      Expected frequency due to
borers      frequency       Poisson ∨ Binomial (n = 2)     Poisson ∨ Binomial (n = 3)

0           907             906.18                         907.66
1           275             276.69                         277.24
2           88              89.92                          86.50
3           23              18.86                          20.14
4           3               4.35                           3.23

$\chi^2$                    1.39                           0.47
Degrees of freedom          2                              2

In Table 4 are shown the results of fitting a Poisson ∨ Binomial to
some data of McGuire et al. (1957) by the method of maximum likelihood.
This table is partially reproduced from Shumway and Gurland (1961). It is
quite evident that a good fit is provided in the case n = 2 and an even better
fit in the case n = 3. Techniques for estimating the parameters of a Poisson
∨ Binomial based on minimum chi-square have been developed by Katti and
Gurland (1962 b).
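
As a check on the foot of Table 4, the Pearson χ² statistics can be recomputed from the tabulated frequencies. The sketch below uses only the numbers printed in the table. The degrees-of-freedom count assumes two fitted parameters for each fixed n, which is an assumption consistent with the tabulated value of 2.

```python
# Observed and expected frequencies copied from Table 4 (0 to 4 corn borers).
observed = [907, 275, 88, 23, 3]
expected = {
    2: [906.18, 276.69, 89.92, 18.86, 4.35],  # Poisson v Binomial, n = 2
    3: [907.66, 277.24, 86.50, 20.14, 3.23],  # Poisson v Binomial, n = 3
}

def pearson_chi_square(obs, exp):
    """Sum over cells of (observed - expected)^2 / expected."""
    return sum((o - e) ** 2 / e for o, e in zip(obs, exp))

chi_square = {n: pearson_chi_square(observed, exp) for n, exp in expected.items()}

# Cells minus one, minus the number of fitted parameters (assumed to be 2).
fitted_parameters = 2
degrees_of_freedom = len(observed) - 1 - fitted_parameters

for n, value in chi_square.items():
    print(f"n = {n}: chi-square = {value:.2f} on {degrees_of_freedom} d.f.")
```

The recomputed values agree with the tabulated 1.39 and 0.47 to within the rounding of the expected frequencies.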


7.  Conclusion 

One might ask what is the purpose of fitting data by discrete
distributions such as those considered here. Apropos of this question it
is interesting that in the application of most standard statistical techniques
based on the Normal distribution a test of fit is not usually performed. This
may be due to a wide experience of a good fit by the Normal distribution or
to the property of robustness (cf. Box and Anderson (1955)) enjoyed by many
tests which are derived for a Normal population but are applied to data
that actually come from a non-Normal population.

In the case of data from a discrete distribution many underlying forms
are possible and the fittings based on these forms may be quite different. A
knowledge of the underlying distribution makes it at least theoretically
possible to construct tests and estimate parameters for the purpose of
making statistical inference.

It is also important for the distributions fitted to biological data
to be based on models which have a reasonable biological meaning. The
compound and generalized distributions, including the Negative Binomial,
Neyman Type A, Poisson ∨ Pascal, and many others, afford interesting
possibilities of such distributions, because they provide a simple
mechanism for explaining the "clumpiness" which is so characteristic
of much biological data.
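
The clumping mechanism can be illustrated by simulation. The sketch below is a hypothetical example, not drawn from the paper's data. It generates Neyman Type A counts as a Poisson number of clusters, each contributing a Poisson number of individuals, and compares the variance-to-mean ratio with that of a plain Poisson of the same mean. The parameter values and function names are assumptions chosen for illustration.

```python
import math
import random
import statistics

def poisson_draw(rng, mean):
    """Knuth's multiplication method; adequate for the small means used here."""
    limit = math.exp(-mean)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def neyman_type_a_draw(rng, lam, mu):
    """Poisson(lam) clusters, each holding a Poisson(mu) number of individuals."""
    clusters = poisson_draw(rng, lam)
    return sum(poisson_draw(rng, mu) for _ in range(clusters))

def variance_to_mean(counts):
    return statistics.pvariance(counts) / statistics.mean(counts)

rng = random.Random(0)                   # fixed seed so the run is repeatable
lam, mu, n_quadrats = 1.0, 2.0, 20000    # hypothetical parameter values

clumped = [neyman_type_a_draw(rng, lam, mu) for _ in range(n_quadrats)]
plain = [poisson_draw(rng, lam * mu) for _ in range(n_quadrats)]

clumped_ratio = variance_to_mean(clumped)
plain_ratio = variance_to_mean(plain)
print(f"clustered counts: variance/mean = {clumped_ratio:.2f}")
print(f"plain Poisson:    variance/mean = {plain_ratio:.2f}")
```

For the Neyman Type A the theoretical ratio is 1 + μ, so the clustered counts should show a ratio near 3 here, against a ratio near 1 for the plain Poisson with the same mean.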



REFERENCES 

Anscombe, F. J. (1950). Sampling theory of the negative binomial and
logarithmic series distributions. Biometrika 37, 358-382.

Arbous, A. G. and Kerrich, J. E. (1951). Accident statistics and the
concept of accident proneness. Part I: A critical evaluation.
Part II: The mathematical background. Biometrics 7, 340-429.

Beall, G. (1940). The fit and significance of contagious distributions
when applied to observations on larval insects. Ecology 21,
460-474.

Beall, G. and Rescia, R. R. (1953). A generalization of Neyman's
contagious distributions. Biometrics 9, 354-386.

Bliss, C. I. (1953). Fitting the negative binomial distribution to
biological data. Biometrics 9, 176-196.

Box, G. E. P. and Anderson, S. L. (1955). Permutation theory in the
derivation of robust criteria and the study of departures from
assumption. Journal of the Royal Statistical Society, Series B,
17, 1-34.

Douglas, J. B. (1955). Fitting the Neyman Type A (two parameter)
contagious distribution. Biometrics 11, 149-173.

Edwards, Carol B. and Gurland, John (1961). A class of distributions
applicable to accidents. Journal of the American Statistical
Association 56, 503-517.

Evans, D. A. (1953). Experimental evidence concerning contagious
distributions in ecology. Biometrika 40, 186-210.

Feller, W. (1943). On a general class of contagious distributions.
Annals of Mathematical Statistics 14, 389-400.

Fisher, R. A., Corbet, A. S., and Williams, C. B. (1943). The relation
between the number of species and the number of individuals in a
random sample of an animal population. Journal of Animal Ecology
12, 42-68.

Fisher, R. A. (1953). Note on the efficient fitting of the negative binomial.
Biometrics 9, 197-199.

Fitzpatrick, Robert (1958). The detection of individual differences in accident
susceptibility. Biometrics 14, 50-66.

Greenwood, M. and Yule, G. Udny (1920). An inquiry into the nature of
frequency distributions representative of multiple happenings with
particular reference to the occurrence of multiple attacks of disease
or of repeated accidents. Journal of the Royal Statistical Society 83,
255-279.

Gurland, John (1957). Some interrelations among compound and generalized
distributions. Biometrika 44, 265-268.

Gurland, John (1958). A generalized class of contagious distributions.
Biometrics 14, 229-249.

Gurland, John (1959). Some applications of the negative binomial and other
contagious distributions. American Journal of Public Health 49,
1388-1399.

Jones, P. C. T., Mollison, J. E., and Quenouille, M. H. (1948). A technique
for the quantitative estimation of soil micro-organisms. Journal of
General Microbiology 2, 54-69.

Katti, S. K. and Gurland, John (1961). The Poisson Pascal distribution.
Biometrics 17, 527-538.

Katti, S. K. and Gurland, John (1962 a). Efficiency of certain methods of
estimation for the negative binomial and the Neyman Type A
distributions. Biometrika 49, 215-226.

Katti, S. K. and Gurland, John (1962 b). Some methods of estimation for the
Poisson Binomial distribution. Biometrics 18, 42-51.

McGuire, Judson U., Brindley, T. A. and Bancroft, T. A. (1957). The
distribution of European corn-borer larvae Pyrausta Nubilalis (Hbn.),
in field corn. Biometrics 13, 65-78.

Neyman, J. (1939). On a new class of contagious distributions applicable
in entomology and bacteriology. Annals of Mathematical
Statistics 10, 35-57.

Polya, G. (1930). Sur quelques points de la théorie des probabilités.
Annales Institut Henri Poincaré 1, 117-161.

Quenouille, M. H. (1949). A relation between the logarithmic, Poisson,
and negative binomial series. Biometrics 5, 162-164.

Shumway, Robert and Gurland, John (1960). Fitting the Poisson Binomial
distribution. Biometrics 16, 522-533.

Shumway, Robert and Gurland, John (1961). A fitting procedure for some
generalized Poisson distributions. Skandinavisk Aktuarietidskrift,
87-108.

Sprott, D. A. (1958). The method of maximum likelihood applied to the
Poisson binomial distribution. Biometrics 14, 97-106.


